Introduction to the MT Plateau
Authors
Douglas A. Jones (1), Department of Defense, 9800 Savage Road, Suite 6514, Fort Meade, MD 20755-6514
Gregory M. Rusk, RABA Technologies, 10500 Little Patuxent Parkway, Columbia, MD 21044
(1) Douglas Jones is now at the National Institute of Standards & Technology, Gaithersburg, MD 20899, [email protected]

Abstract
We describe how we constructed an automatic scoring function for machine translation quality. The function makes use of arbitrarily many pieces of natural language processing software designed to process English-language text. By machine-learning values of functions available inside the software, and by constructing functions that yield values based on the software output, we achieve preliminary, positive results in machine-learning the difference between human-produced English and machine-translation English. We suggest how the scoring function may be used for MT system development.

[...] extract or create numeric values from each piece of software that correspond to the degree to which the software was happy with the input. That array of numbers is the heart of our scoring function for Englishness; we are calling these numeric values "indicators" of Englishness. We then use that array of indicators to drive the machine translation development. In this paper we report on how we have constructed a prototype of this function; in separate work we discuss how to insert this function into a machine-learning regimen designed to maximize the overall quality of the machine translation output.

A Reverse Turing Test
People can generally tell the difference between human-produced English and machine translation English, given the obvious constraints, such as that the reader and writer both have command of the language. Can we get a machine to tell the difference?
Of course, that depends on how good the MT system is: if it were perfect, neither we nor the machines ought to be able to distinguish human output from machine output. MT quality being what it is, that is not a problem for us now. An essential first step toward quality-driven machine translation (QDMT) is what we are calling a "Reverse Turing Test". In the ordinary Turing Test, we want to fool a person into thinking the machine is a person. Here, we are turning that on its head: we want to define a function that can tell the difference between English that a human being has produced and English that the machine has produced. To construct the test, we use a bilingual parallel aligned corpus. We take the foreign-language side and send it through the MT system; then we see whether we can define a scoring function that distinguishes the two versions (original English and MT English).

With our current indicators and corpus, we can machine-learn a function that behaves as follows: if you hand it a human sentence, it correctly classifies it as human 74% of the time; if you hand it a machine sentence, it correctly classifies it as a machine sentence 57% of the time. In the remainder of the paper, we step through the details of the experiment; we also discuss why we neither expect nor require 100% accuracy for this function. Our boundary tests behave as expected and are shown in the final section: we use the same test to distinguish between English and (a) English word salad, (b) English alphabet soup, (c) Japanese, and (d) the identity case of more human-produced English.

(3) Obviously the end goal here is to fail this Reverse Turing Test for a "perfect" machine translation system. We are very far away from this, but we would like to use this function to drive the process toward that eventual and fortunate failure.

Case Study: Japanese-English
In this paper, we report on results using a small corpus of 2,340 sentences drawn from the Kenkyusha New Japanese-English Dictionary.
It was important in this particular experiment to use a very clean corpus (perfectly aligned and minimally formatted). This case study is situated in a broader context: we have conducted exploratory experiments on samples from several corpora, for example the ARPA MT Evaluation corpus, samples from the European Corpus Initiative Data corpus (ECI-1), and others. We found that the scoring function was quite sensitive to formatting problems (for example, tables and sentence segmentation errors cause problems), so here we examine a small corpus that is free from these issues. The sentences are on average relatively short (7.0 words per sentence; 37.6 characters per sentence), which makes our task both easier and harder. It is easier because we have overcome the formatting problems. It is harder because the MT system performs much better on the shorter, cleaner sentences than it did on longer sentences with formatting problems. Since the output is better, it is more difficult to define a function that can tell the difference between the original English and the machine translation English. On balance, this corpus is a good one to illustrate our technique.
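To make the shape of the Reverse Turing Test concrete, the following is a minimal Python sketch under loud assumptions; it is not the classifier or the indicators used in the experiment. The two toy indicators (`common_ratio`, `avg_word_len`), the word list `COMMON_WORDS`, the nearest-centroid learner, and the example sentences are all hypothetical stand-ins: the real indicators come from English NLP software that is not shown here, and the toy sentences are invented rather than drawn from the Kenkyusha corpus. What the sketch does illustrate is the pipeline itself: map each sentence to an array of indicators, learn to separate human English from MT English, and report accuracy separately for each class, as with the 74% and 57% figures above.

```python
# Hypothetical vocabulary used by one toy indicator (an assumption, not from the paper).
COMMON_WORDS = {"the", "a", "is", "of", "to", "in", "he", "she", "it", "was"}


def indicators(sentence):
    """Map a sentence to an array of toy 'Englishness' indicators.

    Stand-ins for values that would be extracted from real NLP software.
    """
    tokens = [w.strip(".,!?") for w in sentence.lower().split()]
    if not tokens:
        return [0.0, 0.0]
    common_ratio = sum(t in COMMON_WORDS for t in tokens) / len(tokens)
    avg_word_len = sum(len(t) for t in tokens) / len(tokens)
    return [common_ratio, avg_word_len]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train(human_sentences, machine_sentences):
    """Nearest-centroid model: one indicator centroid per class.

    This simple learner is swapped in for illustration only.
    """
    return {
        "human": centroid([indicators(s) for s in human_sentences]),
        "machine": centroid([indicators(s) for s in machine_sentences]),
    }


def classify(model, sentence):
    """Return the label whose centroid is closest to the sentence's indicators."""
    vec = indicators(sentence)
    return min(model, key=lambda label: squared_distance(model[label], vec))


def per_class_accuracy(model, human_sentences, machine_sentences):
    """Fraction of human and of machine sentences labeled correctly."""
    human_ok = sum(classify(model, s) == "human" for s in human_sentences)
    machine_ok = sum(classify(model, s) == "machine" for s in machine_sentences)
    return human_ok / len(human_sentences), machine_ok / len(machine_sentences)


# Invented toy data: the "machine" side imitates literal MT output.
human = ["The cat is in the house.", "It was a gift to the king."]
machine = ["Cat house inside existence.", "King toward gift giving completed."]

model = train(human, machine)
print(per_class_accuracy(model, human, machine))  # (1.0, 1.0) on this toy data
```

Reporting the two rates separately, rather than a single pooled accuracy, mirrors how the results are stated above and makes it visible when the function is much better at recognizing one class than the other. In a real evaluation the accuracies would be measured on held-out sentences, not on the training sentences as in this toy run.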